ABSTRACT


R.L. Smith (1989), in his Statistical Science discussion paper, proposed new
methods  for  analyzing  extreme  values  based  on  the  point  process  view  of  high- 
level  exceedances,  and  illustrated  them  with  a  detailed  analysis  of  ozone  data 
from  Houston,  Texas.  The  methods  are  powerful  and,  in  particular,  the  point 
process of cluster peaks over a high threshold provides a remarkable condensation
of the massive data set that he analyzes. It involves little loss of relevant
information  and  permits  fairly  simple  analyses. 


Smith’s  conclusion  is  that  there  is  no  trend  in  the  overall  levels  of  the 
series,  but  that  there  is  a  marked  downward  trend  in  the  extreme  values.  It  seems 
hard to find physical explanations for this, and here the evidence is reassessed in
terms  of  a  comparison  between  competing  models  for  the  intensity  of  a  Poisson 
process.  This  suggests  that  there  is  some  evidence  for  a  decreasing  trend  in 
exceedance  rates  but  that  it  is  rather  weak.  If  there  is  a  trend,  it  seems  more 
likely to consist of a fairly abrupt change than a gradual decrease. The possibility
that such a change is due to an improvement in measurement technology is
discussed.  The  possibility  of  long-memory  dependence  is  also  considered  and 
the  clustering  method  used  is  discussed. 
1. INTRODUCTION

In  his  excellent  paper,  Smith  (1989)  has  synthesized  a  range  of  powerful  methods  for  the 
analysis  of  extreme  values.  The  point  process  of  cluster  peaks  over  a  high  threshold  provides  a 
remarkable  condensation  of  the  massive  data  set  that  he  analyzes.  It  involves  little  loss  of 
relevant  information  and  permits  fairly  simple  analyses.  The  methodology  is  sure  to  find  wide 
application. 

Nevertheless,  I  find  it  hard  to  think  of  physical  explanations  for  the  conclusion  that  there 
has  been  a  downward  trend  in  the  extreme  values  without  any  accompanying  decrease  in  the 
overall  levels  of  the  ozone  series.  Here  I  try  to  reassess  the  evidence  in  terms  of  a  comparison 
between  competing  models  for  the  intensity  of  a  Poisson  process.  The  analysis  suggests  that 
there  is  some  evidence  for  a  decreasing  trend  in  exceedance  rates  but  that  it  is  rather  weak.  If 
there  is  a  trend,  it  seems  more  likely  to  consist  of  a  fairly  abrupt  change  than  a  gradual  decrease. 
The possibility that such a change is due to an improvement in measurement technology is
discussed. I also consider the possibility of long-memory dependence and discuss the clustering
method  used. 


2.  ARE  OZONE  EXCEEDANCE  RATES  DECREASING? 


The  evidence  in  Smith  (1989)  for  decreasing  exceedance  rates  consists  mainly  of  the  fact 
that  the  estimated  trend  was  downward  in  all  the  models  that  incorporated  a  trend.  However, 
these models did not appear to fit better than models that did not incorporate a trend. For
example, the likelihood ratio test statistic for splitting the data was 16.6 with 18 degrees of freedom.


This  may  be  due  more  to  the  large  number  of  degrees  of  freedom  than  to  the  absence  of  an 
effect. It might be worth, for example, fitting a model of the form (4.1), but with the parameter
in question (its symbol is lost in the source) set to $\alpha_j + \beta\delta_j$, where $\delta_j = 0$ for
1973-80 and $\delta_j = 1$ for 1981-86. One could then test the hypothesis that $\beta = 0$,
which involves only one degree of freedom rather than 18. There are many other parsimonious
possibilities.
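
To put the quoted statistic in context, the following minimal sketch computes the upper-tail probability of 16.6 on 18 degrees of freedom and provides the matching calculation for a one-degree-of-freedom test. It assumes the usual chi-squared reference distribution for the likelihood ratio statistic; the function names are illustrative and are not taken from Smith (1989).

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def chi2_sf_1df(x):
    """Upper-tail probability for chi-squared with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Statistic quoted in the text: 16.6 on 18 degrees of freedom.
p_split = chi2_sf_even_df(16.6, 18)
print(f"P(chi2_18 > 16.6) = {p_split:.3f}")  # about 0.55
```

A statistic close to its degrees of freedom gives a tail probability near one half, so an effect confined to a single parameter could easily be masked by the other 17 degrees of freedom.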
Non-homogeneous Poisson process models for exceedances

Professor  Smith’s  conclusion  corresponds  to  a  decreasing  rate  of  occurrence  in  the  point 
processes  of  exceedances  above  high  thresholds.  This  process  was  not  fully  observed,  and  the 
proportion of time monitored varied over the period, increasing gradually but significantly. I
therefore expressed times of occurrence in terms of monitored time since the start of the data,
rather than calendar time. Also, ozone levels are highly seasonal. I estimated the seasonal effect
as piecewise constant within each of the six 61-day periods, and deseasonalized the data by
transforming the time axis (Cox and Lewis, 1966). The resulting series of events are shown in
Tables 1 and 2. I denote by $T$ the period of observation and by $t = (t_1, \ldots, t_n)$ the event times.
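
The time-axis transformation can be sketched as follows. This is only an illustration under stated assumptions: the year is taken to be six consecutive 61-day periods of monitored time, the seasonal factor of each period is estimated from the share of events falling in it, and the factors are scaled to average one so that a full year keeps its length. The function names are hypothetical, and the details of the original calculation may differ.

```python
def seasonal_factors(times, period_len=61.0, n_periods=6):
    """Relative event rate in each seasonal period, scaled to average 1."""
    year = period_len * n_periods
    counts = [0] * n_periods
    for t in times:
        k = min(int((t % year) // period_len), n_periods - 1)
        counts[k] += 1
    total = sum(counts)
    return [c * n_periods / total for c in counts]

def transform_time(t, factors, period_len=61.0):
    """Map raw time t to deseasonalized time by integrating the relative rate."""
    year = period_len * len(factors)
    full_years, rem = divmod(t, year)
    out = full_years * year  # factors average 1, so a whole year maps to itself
    k = 0
    while rem > period_len:
        out += factors[k] * period_len
        rem -= period_len
        k += 1
    return out + factors[k] * rem

# Hypothetical raw event times (monitored days), for illustration only.
raw = [10.0, 20.0, 30.0, 70.0, 400.0]
factors = seasonal_factors(raw)
deseasonalized = [transform_time(t, factors) for t in raw]
```

Periods with many events are stretched and quiet periods are compressed, so that a process with a purely seasonal rate becomes approximately homogeneous on the new time scale.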

TABLE 1

Times of occurrence of exceedances above
threshold level 16, in monitored days from
the start of the data, deseasonalized
(read down the columns)

  62    511    913   1442   1760   2168   2890   3550
  71    539    934   1484   1774   2204   2976   3565
  88    555   1006   1506   1830   2245   3051   3636
 114  [5ro]   1022   1527   1893   2363   3079   3685
 122    596   1043   1554   1924   2417   3153   3827
 146    610   1057   1585   1986   2465   3257   3846
 184    627   1138   1617   2007   2535   3302   3891
 219    662   1184   1621   2045   2568   3325
 235    740   1231   1624    [?]   2608   3346
 285    779   1318   1642   2099   2705   3403
 327    847   1409   1742   2122   2791   3421

Bracketed entries are illegible or missing in the source.

If there is no trend, the data in Tables 1 and 2 are very nearly from a homogeneous Poisson
process; we denote this model by $M_0$. This assumes that any short-term correlation has been
removed by considering only cluster peaks. An alternative hypothesis is that the exceedance rate
has been decreasing smoothly and gradually. This may conveniently be represented by the
log-linear Poisson process, $M_1: \lambda(s) = \rho e^{-\beta s}$, where $\lambda(s)$ is the rate of occurrence at time $s$.
Another possibility, suggested by the splitting of the data in Smith (1989), is that the exceedance
rate decreased fairly abruptly within a short time period. This may be represented by the
change-point Poisson process, $M_2: \lambda(s) = \lambda_1$ if $0 \le s \le \tau$ and $\lambda(s) = \lambda_2$ if $\tau < s \le T$.

TABLE 2

Times of occurrence of exceedances above
threshold level 20, in monitored days from
the start of the data, deseasonalized
(read down the columns)

  71    327   1057   1624   1986   2465   3079
  88    555   1442   1642   2099   2608   3550
 122    847   1484   1830   2122   2791   3846
 285    913   1585   1893   2417   2976



Model  comparison 

The three competing models, $M_0$, $M_1$ and $M_2$, may be compared using the Bayes factor, or
ratio of posterior to prior odds for $M_i$ against $M_j$, $B_{ij}$, for each pairwise comparison. It has been
argued that Bayes factors are better measures of evidence than P values (Berger and Sellke,
1987), and they are also more readily applicable to the comparison of non-nested models.


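With vague prior information, the Bayes factor for $M_0$ against the log-linear Poisson
process, $M_1$, is

$$B_{01} = 0.645\,(n-1) \Big/ \int_0 e^{-Ry}\,[\ldots]\,dy, \qquad (1)$$

where $R = \sum_i t_i / T$ (Akman and Raftery, 1986a); the upper limit and the remainder of the
integrand in (1) are not legible in the source. The Bayes factor against the change-point
Poisson process, $M_2$, is

$$B_{02} = 4\pi^{1/2}\,\Gamma(n+\tfrac{1}{2}) \Big/ \sum_{i=0}^{n} \Gamma(i+\tfrac{1}{2})\,\Gamma(n-i+\tfrac{1}{2})\,I_i \qquad (2)$$

(Raftery and Akman, 1986). In (2), $I_i = \int_{u_i}^{u_{i+1}} x^{-(i+1/2)} (1-x)^{-(n-i+1/2)}\,dx$, where $u_i = t_i/T$.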
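
As a rough illustration of how (2) might be evaluated numerically, the sketch below assumes that the formula reads $B_{02} = 4\pi^{1/2}\,\Gamma(n+\tfrac12) / \sum_{i=0}^{n} \Gamma(i+\tfrac12)\Gamma(n-i+\tfrac12) I_i$ with $I_i = \int_{u_i}^{u_{i+1}} x^{-(i+1/2)}(1-x)^{-(n-i+1/2)}dx$, and further assumes the conventions $u_0 = 0$ and $u_{n+1} = 1$. The function name, the substitution $x = \sin^2\theta$ (used to remove the endpoint singularities) and the use of Simpson's rule are choices made here, not details taken from Raftery and Akman (1986).

```python
import math

def log10_b02(times, T, steps=200):
    """log10 of B_02 under the assumed form of equation (2); steps must be even."""
    ts = sorted(times)
    n = len(ts)
    if n == 0 or ts[0] <= 0 or ts[-1] >= T:
        raise ValueError("event times must lie strictly inside (0, T)")
    # With x = sin(theta)^2, I_i becomes the integral of
    # 2 * sin(theta)^(-2i) * cos(theta)^(-2(n-i)), which is smooth on each piece.
    theta = [0.0] + [math.asin(math.sqrt(t / T)) for t in ts] + [math.pi / 2]
    log_terms = []
    for i in range(n + 1):
        a, b = theta[i], theta[i + 1]
        if b <= a:
            continue  # tied event times give an empty interval

        def f(th):
            return 2.0 * math.sin(th) ** (-2 * i) * math.cos(th) ** (-2 * (n - i))

        # Composite Simpson's rule on [a, b].
        h = (b - a) / steps
        total = f(a) + f(b)
        for k in range(1, steps):
            total += (4 if k % 2 else 2) * f(a + k * h)
        integral = total * h / 3.0
        log_terms.append(
            math.lgamma(i + 0.5) + math.lgamma(n - i + 0.5) + math.log(integral)
        )
    # Log-sum-exp keeps the large gamma-function terms from overflowing.
    m = max(log_terms)
    log_sum = m + math.log(sum(math.exp(v - m) for v in log_terms))
    log_b = math.log(4.0) + 0.5 * math.log(math.pi) + math.lgamma(n + 0.5) - log_sum
    return log_b / math.log(10.0)

# Sanity check: a single event at the midpoint of the observation period.
check = log10_b02([0.5], 1.0)
```

Under the assumed form, a single event at the midpoint gives $B_{02} = 1$, so `check` should be numerically zero. Applying the function to Tables 1 and 2 would require the value of $T$, which is not stated in the text.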
A classical test statistic for $M_0$ against $M_1$ is $U = (R - \tfrac{1}{2}n)/\sqrt{n/12}$, which has
approximately a standard normal distribution under $M_0$ (Cox and Lewis, 1966). A test of $M_0$ against
$M_2$ may be based on the quantity

$$A = n^{1/2} \max_{0.01 \le u_i \le 0.99} \{\, |g(i-1, u_i)|,\ |g(i, u_i)| \,\}, \qquad (3)$$

where $g(i, u) = i\,n^{-1}\sqrt{(1-u)/u} - (n-i)\,n^{-1}\sqrt{u/(1-u)}$. The approximate 5% critical value
for this test is 3.29 (Akman and Raftery, 1986b).
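
The two classical statistics can be computed directly from the event times. The sketch below assumes $R = \sum_i t_i/T$, $U = (R - n/2)/\sqrt{n/12}$, and $g(i,u) = (i/n)\sqrt{(1-u)/u} - ((n-i)/n)\sqrt{u/(1-u)}$, which is algebraically the standardized difference $(i/n - u)/\sqrt{u(1-u)}$; this form of $g$ is a reading of a damaged formula and should be checked against Akman and Raftery (1986b). The function names are illustrative.

```python
import math

def u_statistic(times, T):
    """Cox-Lewis statistic for a homogeneous process against a log-linear trend."""
    n = len(times)
    R = sum(times) / T
    return (R - n / 2.0) / math.sqrt(n / 12.0)

def a_statistic(times, T, lo=0.01, hi=0.99):
    """Change-point statistic (3), under the assumed form of g."""
    n = len(times)

    def g(i, u):
        return (i / n) * math.sqrt((1 - u) / u) - ((n - i) / n) * math.sqrt(u / (1 - u))

    best = 0.0
    for i, t in enumerate(sorted(times), start=1):
        u = t / T
        if lo <= u <= hi:
            # Evaluate just before and just after the jump at u_i.
            best = max(best, abs(g(i - 1, u)), abs(g(i, u)))
    return math.sqrt(n) * best

# Evenly spaced events (illustrative): no trend, so U should be zero.
even = [k - 0.5 for k in range(1, 21)]
u_even = u_statistic(even, 20.0)
a_even = a_statistic(even, 20.0)

# Events concentrated early in the period give a negative U.
u_early = u_statistic([float(k) for k in range(1, 21)], 100.0)
```

A negative value of $U$ indicates that events occur earlier than expected under a constant rate, which is the direction of a decreasing trend.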


In Table 3 there is evidence against a gradual decrease in exceedance rate of the form
specified by $M_1$. The posterior odds for the change-point Poisson process are 2.7:1 and 2:1 at
threshold levels of 16 and 20 respectively. In the words of Jeffreys (1961), this constitutes
evidence against the homogeneous Poisson process, but it is not worth more than a bare mention. In
addition, the result of the classical test based on (3) is not significant.

TABLE 3

Model comparison results.

$B_{01}$ is the Bayes factor for the homogeneous Poisson process
against the log-linear Poisson process given by (1);
$U$ is the corresponding classical test statistic.
$B_{02}$ is the Bayes factor for the homogeneous Poisson process
against the change-point Poisson process given by (2);
$A$ is a corresponding classical test statistic given by (3).

                          Threshold level
                            16        20
$\log_{10} B_{01}$        1.69      1.11
$U$                      -1.02     -0.97
$\log_{10} B_{02}$       -0.44     -0.30
$A$                       2.33      2.15

Checking the homogeneous Poisson process

Since  the  evidence  for  a  trend  appears  weak,  it  seems  worth  checking  the  homogeneous 
Poisson  process  model  itself.  One  way  of  doing  this  is  to  compare  the  observed  evolution  with 
those  of  several  data  sets  simulated  from  the  model.  Since  the  homogeneous  Poisson  process  is 
time-reversible,  we  can  do  the  same  with  the  time-reversed  data  set.  Ripley  (1977),  who 
pioneered  this  approach,  used  point  estimates  of  the  model  parameters  in  the  simulations,  but 
this may lead to simulated bands which are too narrow. Here, uncertainty about the parameter $\lambda$
of the homogeneous Poisson process is incorporated as follows (Rubin, 1984; Raftery, 1988).
First, generate a value of $\lambda$ from its posterior distribution, taken here to be Gamma$(n+\tfrac{1}{2}, T)$,
and then proceed as before.
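
A minimal sketch of this simulation check follows. It assumes that the second Gamma parameter is a rate, so that the posterior has shape $n + \tfrac12$ and scale $1/T$, and that each simulated data set is a homogeneous Poisson process on $[0, T]$ summarized by its cumulative count on a grid. The grid, the seed, the value of $T$ in the example and the function names are hypothetical.

```python
import random

def simulate_hpp(rate, T, rng):
    """Event times of a homogeneous Poisson process on [0, T]."""
    times, t = [], rng.expovariate(rate)
    while t < T:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def cumulative_counts(times, grid):
    """Number of events at or before each (increasing) grid point."""
    counts, j = [], 0
    for s in grid:
        while j < len(times) and times[j] <= s:
            j += 1
        counts.append(j)
    return counts

def envelope(n, T, grid, n_sims=19, seed=0):
    """Pointwise outer envelopes of n_sims simulated cumulative-count paths."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        # Draw the rate from Gamma(shape n + 1/2, scale 1/T); rate form assumed.
        lam = rng.gammavariate(n + 0.5, 1.0 / T)
        sims.append(cumulative_counts(simulate_hpp(lam, T, rng), grid))
    lower = [min(col) for col in zip(*sims)]
    upper = [max(col) for col in zip(*sims)]
    return lower, upper

# Illustrative call: 27 events, with T = 3900 chosen only as a placeholder.
grid = [100.0 * k for k in range(40)]
lower, upper = envelope(27, 3900.0, grid)
```

The observed cumulative count would then be plotted against `lower` and `upper`; for the time-reversed check, each event time $t_i$ is replaced by $T - t_i$ before counting.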

The  result  is  shown  in  Figure  1.  The  data  do  not  go  outside  the  simulated  bands  except  very 
briefly in Figures 1(b) and 1(d). Once again, the evidence against the homogeneous Poisson
process does not seem strong. The simulated bands may still be too narrow because they do not
take  account  of  uncertainty  about  the  estimated  seasonal  effect,  and  so  the  evidence  may  be  even 
weaker  than  it  appears. 

Analysis  of  the  change-point  Poisson  process  model 

The  analysis  so  far  suggests  that,  if  there  is  a  trend,  it  is  better  represented  by  a  change- 
point  than  by  a  gradual  decrease.  The  posterior  distribution  of  the  change-point  is  given  by 
equation  (2.3)  of  Raftery  and  Akman  (1986)  and  is  shown  in  Figure  2.  At  threshold  level  20,  the 
posterior distribution is less diffuse than at level 16. The posterior mode is April 11, 1984, which
is  at  the  beginning  of  the  1984  "ozone  season".  This  result  reflects  the  fact  that  there  was  only 
one  exceedance  above  level  20  in  each  of  1984,  1985  and  1986,  compared  with  28  such 
exceedances  in  the  previous  ten  seasons.  The  posterior  mode  at  threshold  level  16,  which  is  less 
marked  than  at  level  20,  is  June  25,  1981. 

The  analysis  here  is  tentative  in  many  ways.  In  particular  it  seems  important  to  include 
relevant covariates, especially temperature, as emphasized by Davison and Hemphill (1986).


[Figure 1: four panels, each showing the cumulative number of cluster peaks against time in
monitored days, deseasonalized. (a) Threshold level 16; (b) Threshold level 16, data reversed;
(c) Threshold level 20; (d) Threshold level 20, data reversed.]

FIGURE 1. Diagnostic checking for the homogeneous Poisson process. In each graph, the solid
line represents the data and the dotted lines are the outer envelopes of 19 simulations from the
model, as described in the text.

[Figure 2: two panels plotted against time in monitored days, deseasonalized.
(a) Threshold level 16; (b) Threshold level 20.]

FIGURE 2. Posterior distribution of the change-point in the change-point Poisson process
model.
It would be interesting to know if there were any events which could have caused an abrupt
change,  such  as  legislation,  changes  in  Federal  standards,  highway  development,  or  changes  in 
data  collection,  checking  and  reporting  practices. 

My  colleague  Peter  Guttorp  has  suggested  the  following  as  a  possible  explanation.  In  the 
past,  hourly  ozone  measurements  were  often  based  on  25-minute  averages.  Now  it  is  more  usual 
for measurement to be continuous, so that hourly measurements are based on 60-minute averages.
Such a change in instrumentation would not have changed the overall level of the series,
but it might well have reduced its variability, and hence the exceedance rate, because the
measurements are based on more data. This would have been an abrupt change rather than a gradual
one,  and  so  seems  consistent  with  the  results  here  and  in  Smith  (1989).  It  would  be  interesting  to 
know  if  there  was  such  a  change  in  instrumentation  in  Houston  during  the  period  covered  by  the 
data.  If  so,  it  would  suggest  that  we  are  not  seeing  an  improvement  in  compliance  with  the 
Federal  standard,  but  rather  a  change  in  measurement  technology. 
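
The mechanism can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions: minute-level readings are taken to be independent Gaussian values with a hypothetical mean, standard deviation and threshold, whereas real ozone readings are autocorrelated, so the reduction in variability would be smaller in practice.

```python
import random

def hourly_values(minute_series, window):
    """One value per hour: the mean of the first `window` minutes of that hour."""
    return [
        sum(minute_series[h:h + window]) / window
        for h in range(0, len(minute_series) - 59, 60)
    ]

def summarize(values, threshold):
    """Return (mean, variance, number of exceedances above threshold)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var, sum(v > threshold for v in values)

rng = random.Random(1)
# Hypothetical minute-level series: 2000 hours, mean 10, standard deviation 5.
minutes = [rng.gauss(10.0, 5.0) for _ in range(60 * 2000)]

mean25, var25, exc25 = summarize(hourly_values(minutes, 25), threshold=12.0)
mean60, var60, exc60 = summarize(hourly_values(minutes, 60), threshold=12.0)
```

In this setting the two hourly series have nearly the same mean, but the 60-minute averages have a smaller variance and far fewer exceedances of the fixed threshold, which is the pattern described above.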

3. OTHER ISSUES

Long-memory  dependence 

Long-memory  dependence  is  known  to  be  a  feature  of  at  least  some  climatic  variables 
(Haslett  and  Raftery,  1989,  and  references  therein).  Climate  influences  ozone  levels,  so  it  is 
possible  that  ozone  levels  may  also  exhibit  long-memory  dependence.  This  is  characterized  by 
small but non-negligible autocorrelations at long lags, an infinite spike at zero in the spectrum
or  "cycles  of  all  periods",  and  high  variability  of  the  sample  mean  and  other  statistics.  It  is  hard 
to  detect  but  can  dramatically  affect  statistical  analyses. 

Is  there  any  evidence  of  long-memory  dependence  in  the  ozone  data?  How  could  it  be 
detected?  Would  it  affect  the  analysis  of  extreme  values?  If  so,  how  could  it  be  incorporated 
into  the  analysis? 
The clustering method

The method for forming clusters used by Smith (1989) is essentially the single link method.
This  has  the  possible  disadvantage  that  two  clusters  six  days  apart  with  a  single  exceedance 
between  them  could  be  merged  (Gordon,  1981).  An  agglomerative  sum  of  squares  method  might 
be preferable given that the aim is to obtain compact clusters. Inspection of the dendrogram could
help  with  the  choice  of  a  cluster  interval. 
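
The chaining behaviour of the single link method is easy to reproduce on a time axis. The sketch below assumes that two exceedances belong to the same cluster when the gap between them does not exceed a chosen cluster interval; the interval of three days and the example times are hypothetical and are not taken from Smith (1989).

```python
def single_link_clusters(times, interval):
    """Group sorted event times; a gap larger than `interval` starts a new cluster."""
    clusters = []
    for t in sorted(times):
        if clusters and t - clusters[-1][-1] <= interval:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters

# Two groups of exceedances six days apart (times in days, illustrative).
separate = single_link_clusters([0, 1, 2, 8, 9, 10], interval=3)

# A single exceedance on day 5 links the two groups into one long cluster.
chained = single_link_clusters([0, 1, 2, 5, 8, 9, 10], interval=3)

print(len(separate), len(chained))  # prints "2 1"
```

The merged cluster spans ten days even though no two of its neighbouring members are more than three days apart, which is the lack of compactness noted above.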


